ABSTRACT

The physical importance of the apparent discrepancy between the detections 
by pre-BATSE missions of absorption lines in gamma-ray burst spectra and 
the absence of a BATSE line detection necessitates a statistical analysis of this 
discrepancy. This analysis requires a calculation of the probability that a line, 
if present, will be detected in a given burst. However, the connection between 
the detectability of a line in a spectrum and in a burst requires a model for the 
occurrence of a line within a burst. We have developed the necessary weighting 
for the line detection probability for each spectrum spanning the burst. The 
resulting calculations require a description of each spectrum in the BATSE 
database. With these tools we identify the bursts in which lines are most likely 
to be detected. Also, by assuming a small frequency with which lines occur, we 
calculate the approximate number of BATSE bursts in which lines of various 
types could be detected. Lines similar to the Ginga detections can be detected 
in relatively few BATSE bursts; for example, in only ~ 20 bursts are lines 
similar to the GB 880205 pair of lines detectable. Ginga reported lines at ~ 20 
and ~ 40 keV whereas the low energy cutoff of the BATSE spectra is typically 
above 20 keV; hence BATSE's sensitivity to lines is less than that of Ginga 
below 40 keV, and greater above. Therefore the probability that the GB 880205 
lines would be detected in a Ginga burst rather than a BATSE burst is ~ 0.2. 
Finally, we adopted a more appropriate test of the significance of a line feature. 

Subject headings: gamma-rays: bursts — methods: statistical 
1. INTRODUCTION

The continued absence of a line detection in the gamma-ray burst spectra accumulated
by the Burst and Transient Source Experiment (BATSE) on the Compton Gamma-Ray
Observatory (CGRO) (Palmer et al. 1994; henceforth Paper I) has led us to continue not
only the search for lines in the BATSE data (Briggs et al. 1996), but also our study of the
detectability of lines by the BATSE detectors and the statistical implications of the current
results. In particular, we are evaluating the consistency between the BATSE observations
and those of previous missions, particularly those of Ginga. These calculations assume the
BATSE detectors function properly and that our models of their performance are accurate,
assumptions which we test continuously (Paciesas et al. 1997; Preece et al. 1997). Here we
fill a major gap in our statistical methodology and implement it for the BATSE data.

The description of our statistical methodology is clearest using conditional probabilities
and their associated notation. Thus $p(a \mid b)$ means the probability of proposition $a$ given
proposition $b$. A bar over a proposition denotes the negation of that proposition. Since
we need to differentiate between quantities which refer either to a burst as a whole or to
a specific spectrum accumulated over a portion of the burst, we use the convention that
roman indices specify spectra and greek indices identify bursts. For example, $l_\sigma$ represents
the proposition that a line exists in the $\sigma$th burst, while $l_i$ is the proposition that a
line is present in the $i$th spectrum; technically the burst within which the spectrum was
accumulated should also be indicated (e.g., $l_{\sigma i}$), but the burst will be understood from the
context. As a reminder that these probabilities rely on our understanding of the detectors
and gamma-ray bursts we include as one of the givens the proposition $I$, which represents
our model of the detector response, our parameterization of the burst continuum, etc. Our
calculations can be seriously in error if our assumptions expressed by $I$ are incorrect. For
example, we use the "GRB" spectral function (Band et al. 1993) to model the continuum,
but this spectral shape is not based on the source physics, and therefore must be incorrect
at some level of accuracy.

Our analysis of the possible line content of a burst sample is based on a hierarchy of
probabilities. Ultimately we want the probability $p(D \mid HI)$ of obtaining the observed data
(proposition $D$) assuming hypothesis $H$ (Band et al. 1994, hereafter Paper II). Thus $D$
might represent the statement that no lines have been detected in the BATSE database,
and $H$ might be the hypothesis that lines exist and that we are modeling BATSE correctly.
The information which might be represented by $I$ and $H$ overlaps; in general $H$ should
include the information which differs when hypotheses are compared. Also known as the
likelihood for $H$, $p(D \mid HI)$ can be used in measures of the consistency between BATSE
and previous detectors. Our methodology does not result in $p(D \mid HI)$ directly, but rather
in $p(D \mid fHI)$ where $f$ is the line frequency, the probability that a line is found in a
given burst. When necessary, this dependence on $f$ is removed by the Bayesian process
of "marginalization." Since bursts are presumably independent events, $p(D \mid fHI)$ is the
product of the probabilities of obtaining the observed detections or nondetections in each
burst. Thus, for the $N_B$ BATSE bursts in which no lines have been detected

$$p(D \mid fHI) = \prod_{\sigma=1}^{N_B} \left[1 - p(L_\sigma \mid fHI)\right] \qquad (1)$$

where $L_\sigma$ is the proposition that a line has been detected in the $\sigma$th burst, and therefore
$p(L_\sigma \mid fHI)$ is the probability of detecting a line in this burst. If present, a line may persist
over a time range shorter than the burst duration, and will be found in the $i$th spectrum
accumulated during the burst. Therefore $p(L_\sigma \mid fHI)$ will be a function of the probability
$p(L_i \mid fHI)$ of detecting the line in the $i$th spectrum; $L_i$ is the proposition that the line
was detected in the $i$th spectrum. The connection between $p(L_\sigma \mid fHI)$ and $p(L_i \mid fHI)$ is
presented in this paper. The line detection may be real or false, and therefore (Band et al.
1995, hereafter Paper III)

$$p(L_i \mid fHI) = p(L_i \mid l_i HI)\,p(l_i \mid fHI) + p(L_i \mid \bar{l}_i HI)\,p(\bar{l}_i \mid fHI) , \qquad (2)$$

where $l_i$ is the proposition that the line is present in the $i$th spectrum. Thus $p(L_i \mid l_i HI)$ is the
probability of detecting a line which is present in a spectrum and $p(L_i \mid \bar{l}_i HI)$ is the
probability of a "false positive." Currently we assume that our detection criteria are stringent
enough to make the false positive rate negligible, an assumption we will investigate in the future.
In this paper we also discuss the database describing the BATSE spectra necessary to calculate
$p(L_i \mid l_i fHI)$; we also present some results of utilizing this database.
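As an illustration of eqs. (1) and (2), a minimal Python sketch follows. The function and argument names are hypothetical and are not part of the analysis described here; the false positive probability defaults to zero, mirroring the assumption stated above.

```python
import math


def p_detect_spectrum(p_det_given_line, p_line, p_false_positive=0.0):
    """Eq. (2): probability of a detection in one spectrum.

    p_det_given_line : p(L_i | l_i H I), detection probability if the line is present
    p_line           : p(l_i | f H I), probability the line is present in this spectrum
    p_false_positive : p(L_i | not-l_i H I), assumed negligible by default
    """
    return p_det_given_line * p_line + p_false_positive * (1.0 - p_line)


def likelihood_no_detections(p_detect_per_burst):
    """Eq. (1): product over bursts of 1 - p(L_sigma | f H I).

    Logarithms are summed so that a long list of bursts does not lose precision.
    """
    log_like = 0.0
    for p in p_detect_per_burst:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        if p == 1.0:
            return 0.0  # a certain detection that did not happen rules the case out
        log_like += math.log1p(-p)
    return math.exp(log_like)
```

For instance, two bursts that each had a 50% chance of yielding a detection give a likelihood of 0.25 for the observation that neither did.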

The probability $p(L_i \mid l_i HI)$ of detecting a line which is present in the $i$th spectrum
is the foundation of our methodology. If they exist, lines are undoubtedly characterized
by a currently unknown distribution of energy centroids, line widths, intensities (e.g.,
equivalent widths), and perhaps other parameters. The detectability of each line type
can be considered separately; we generally have been using the lines reported by Ginga
in the S1 segment of GB 870303 (Graziani et al. 1992) and in GB 880205 (Murakami et
al. 1988) as archetypes, although a generic set of lines can be used. We find (Paper III)
that the detectability of a given line is a function of the strength of the continuum (i.e., the
signal-to-noise ratio, SNR) and the angle between the detector normal and the burst (the
burst angle). This study of line detectability shows that BATSE would have detected the
lines reported by Ginga, assuming of course that BATSE functions as modeled.

Next we need $p(L_\sigma \mid l_\sigma HI)$, the probability of detecting a line somewhere in the $\sigma$th
burst given that the line is indeed present, which requires a relationship between the
presence of a line in the $\sigma$th burst and its presence in the $i$th spectrum within this burst.
Thus we may calculate that there is a very high probability of detecting a line in a given
spectrum if it is present ($p(L_i \mid l_i HI) \sim 1$), but may conclude that even if a line is present
somewhere in the burst, it is most likely not in the spectrum in question ($p(l_i \mid l_\sigma HI) \ll 1$).
For example, based on empirical evidence or theoretical prejudice we may believe that lines
do not persist for the length of time over which the spectrum was accumulated. The few
line detections from all the burst missions are insufficient to map out the line distribution
(e.g., energies, intensities, widths, persistence times), and therefore we must model the
probability $p(l_i \mid l_\sigma HI)$ that a line occurs in the $i$th spectrum, regardless of whether it is
detectable.

With $p(L_\sigma \mid l_\sigma HI)$ for the bursts observed by different missions we can evaluate the
consistency between these missions. Both BATSE and Ginga provided sufficient data to
carry out such an analysis. We developed a number of measures of the consistency between
these two missions using both standard "frequentist" (Paper I) and Bayesian statistics
(Paper II). In addition, values of $p(L_\sigma \mid l_\sigma HI)$ for the bursts in the BATSE database can
be used to identify the most promising bursts for further analysis. We first apply new line
search techniques (Schaefer et al. 1994; Briggs et al. 1996) to bursts in which lines are most
likely to be detected.

The statistical analysis outlined above requires a characterization of all the spectra 
from BATSE and Ginga, and measures of line detectability for both instruments. Paper III 
provides the line detection probabilities for the BATSE Spectroscopy Detectors (SDs), the 
relevant BATSE detectors. Below we describe the database of BATSE spectra created to 
characterize the BATSE bursts. Fenimore et al. (1993) performed a preliminary evaluation
of the Ginga data for the GB 880205 line set; a more extensive extraction of the necessary 
Ginga information is planned. 

Here we consider a "real" line to be a true feature in the spectrum that arrives at the 
detector; most likely a real feature was emitted by the burst, although the feature may 
possibly have been imposed on the spectrum by astrophysical processes between the burst 
source and the detector. A "detection" is a feature which satisfies the detection criteria 
and is therefore considered to be a real line. Note that we treat the detection of a line as 
a binary conclusion: a feature is either considered to be a detected line or it is ignored in 
subsequent analysis. Unlike frequentist statistics, in which a hypothesis is either true or 
false, Bayesian statistics (which is used in Paper II) allows our confidence in a hypothesis 
to be quantified via a probability that the hypothesis is true. In principle we could develop
a formalism which permits a fractional detection through the probability that a feature
is real. We could then develop a methodology of using the entire spectral database to
tease out the line distribution from all the spectral bumps and wiggles, most of which are
undoubtedly fluctuations, but a small fraction of which might result from an underlying
distribution of real lines. Specifically, deviations from the distribution of line significances
expected from mere fluctuations could be used to estimate the distribution of true spectral
lines; this assumes that the fluctuation distribution can be estimated accurately. However,
most scientists are more comfortable working with definite detections and nondetections, 
and that is the route we have taken. 

Our detection criteria are 1) that a spectral feature is significant in the spectrum 
from one detector, and 2) that all spectra from the detectors which observed the burst 
are consistent. Until recently significance was defined using the F-test. However, as 
discussed in the appendix, the F-test is appropriate when the uncertainties on the measured
quantities (here the counts in each channel) are unknown (Eadie et al. 1971), which is not
the case here. We have therefore adopted a maximum likelihood ratio test which uses $\Delta\chi^2$
as the relevant statistic (D. Lamb, private communication, 1995). In practice, these two 
tests usually give comparable results. 

This paper begins with the method for connecting the probabilities of detecting a line 
in a given spectrum and anywhere in a burst (§2). The resulting methodology requires the 
line detection probability for every burst spectrum accumulated by BATSE; a database 
of parameters describing these spectra is required to calculate these probabilities (§3). 
This data is used to rank the BATSE bursts by the maximum signal-to-noise ratio of any 
spectrum accumulated during a burst. This database is also utilized to find the number 
of bursts in which different line types could be detected under the assumption of a small 
line occurrence frequency (§4); this number is crucial to the study of consistency between
the BATSE and Ginga observations (§5). In the first appendix we discuss the maximum 
likelihood ratio test which we have adopted to evaluate the significance of an observed line 
feature. The second appendix lists the large number of symbols used in this work. 

2. PROBABILITY A LINE IS PRESENT IN A GIVEN BURST 

Here we consider the probability of detecting a line in a single burst, and therefore
we suppress the (greek) indices specifying the burst. Also, we consider the probability of
detecting a line of a given type specified by parameters such as energy centroid, intrinsic
width and intensity (but not the time over which the line is present); the resulting
calculation must be done for each line type. BATSE accumulates a series of consecutive
spectra from four different SDs (the Spectroscopy detector High Energy Resolution,
Burst, or SHERB, data type); we refer to these basic spectra which provide the finest time
resolution available as SHERB spectra. We assume there are $N$ SHERB spectra for a
given detector across the burst from which it is possible to construct $N(N+1)/2$ different
averaged spectra composed of consecutive SHERB spectra; quantities describing these
averaged spectra are specified by roman indices (e.g., $l_i$ means a line exists in the $i$th
averaged spectrum).
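The count of $N(N+1)/2$ averaged spectra can be checked by enumerating every run of consecutive SHERB spectra. The following is a minimal Python sketch; the function name and the zero-based inclusive index convention are illustrative assumptions.

```python
def consecutive_sums(n_sherb):
    """Return (first, last) index pairs, inclusive and zero-based, for every
    averaged spectrum that can be built from consecutive SHERB spectra."""
    return [(first, last)
            for first in range(n_sherb)
            for last in range(first, n_sherb)]


if __name__ == "__main__":
    pairs = consecutive_sums(4)
    print(len(pairs))  # 4 * 5 / 2 = 10 averaged spectra
```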

The probability of detecting a line in the burst as a whole is the probability of detecting 
a line in at least one of the spectra which can be searched, 

$$p(L \mid fHI) = 1 - \prod_{i=1}^{N(N+1)/2} \left[1 - p(L_i \mid fHI)\right] \qquad (3)$$

(the probability of at least one detection is one minus the probability of no detections). 
As will be discussed below, the line frequency $f$ is the probability that a line is present
somewhere in the burst, $f = p(l \mid HI)$. A line detection in a given spectrum results either
from the detection of a real line or a spurious detection (i.e., a "false positive"),

$$\begin{aligned}
p(L_i \mid fHI) &= p(L_i \mid l_i HI)\,p(l_i \mid fHI) + p(L_i \mid \bar{l}_i HI)\,p(\bar{l}_i \mid fHI) , \\
&= p(L_i \mid l_i HI)\,p(l_i \mid fHI) + p(L_i \mid \bar{l}_i HI)\left[1 - p(l_i \mid fHI)\right] , \qquad (4)
\end{aligned}$$

where we use the fact that $l_i$ and $\bar{l}_i$ are exhaustive, so that
$p(l_i \mid fHI) + p(\bar{l}_i \mid fHI) = 1$. Paper III calculated
$p(L_i \mid l_i HI)$ for the BATSE SDs, while the probability of a spurious detection $p(L_i \mid \bar{l}_i HI)$
will be studied further, but is clearly dependent on the detection threshold.
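Eq. (3) also shows how a small false positive probability per spectrum accumulates over the many averaged spectra that are searched. The sketch below is illustrative only; the function name and the numerical values are assumptions, not results from the BATSE analysis.

```python
def p_detect_burst(p_detect_per_spectrum):
    """Eq. (3): probability of at least one detection among the searched
    spectra, i.e. one minus the probability of no detection in any of them."""
    p_none = 1.0
    for p in p_detect_per_spectrum:
        p_none *= 1.0 - p
    return 1.0 - p_none


if __name__ == "__main__":
    # Hypothetical case: no real line, 10 SHERB spectra (55 averaged spectra),
    # and an assumed false positive probability of 1e-4 per spectrum.
    n_spectra = 10 * 11 // 2
    print(p_detect_burst([1e-4] * n_spectra))
```

The burst-level spurious detection probability in this toy case is roughly 55 times the per-spectrum value, which is why the detection threshold matters.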

Our focus here is $p(l_i \mid fHI)$, which is a statement of how lines occur in burst spectra.
Is the probability that a line occurs in a burst the same for all bursts, or does it depend on
duration, spectral hardness or other burst properties? Do lines persist for a long time or
for short intervals? Unfortunately, since there have been very few detections, we know very
little about $p(l_i \mid fHI)$. Therefore, we have to construct reasonable models of $p(l_i \mid fHI)$
which we will use for further calculations.

Let $dp(l \mid t_b t_e fHI) = g(t_b, t_e)\,dt_b\,dt_e$ be the probability density for a line beginning at $t_b$
and ending at $t_e$. If we assume that the probability depends only on the time a line persists,
and does not favor the beginning or end of the burst, then $g(t_b, t_e)$ will depend only on the
persistence time $t_e - t_b$, i.e., $g(t_b, t_e) = g(t_e - t_b)$. Since the data consist of discrete spectra,
we cannot isolate the spectrum over the precise interval during which a line is present (if
such exists since the line intensity may vary). Instead, the line will be attributed to a
particular sum of consecutive SHERB spectra with an accumulation period overlapping
the time the line was actually present; conversely, a spectrum summed from a number of
SHERB spectra may show lines with a variety of beginning and end times. The probability
that a line begins between $t_{b1}$ and $t_{b2}$ and ends between $t_{e1}$ and $t_{e2}$, and would be attributed
to the $i$th spectrum accumulated between $t_j$ and $t_k$ ($t_{b1} < t_j < t_{b2} < t_{e1} < t_k < t_{e2}$), is

$$p(l_i \mid fHI) = 1 - \exp\left[-\int_{t_{b1}}^{t_{b2}} dt_b \int_{t_{e1}}^{t_{e2}} dt_e\, g(t_b, t_e)\right] ; \qquad (5)$$

the probability of a line existing in the burst is

$$p(l \mid fHI) = 1 - \exp\left[-\int_0^T dt_b \int_{t_b}^T dt_e\, g(t_b, t_e)\right] , \qquad (6)$$

where $T$ is the burst duration. These expressions are derived from
$1 - p(l \mid fHI) = \prod_j \left[1 - p(l_j \mid fHI)\right] = \exp\left[\sum_j \ln(1 - p(l_j \mid fHI))\right] = \exp\left[-\sum_j p(l_j \mid fHI)\right]$, where
$\ln(1 - p(l_j \mid fHI)) \simeq -p(l_j \mid fHI)$ is valid because $p(l_j \mid fHI) = g(t_b, t_e)\,dt_b\,dt_e$ (i.e.,
$p(l_j \mid fHI)$ is small). Finally, $\sum_j p(l_j \mid fHI) = \int g(t_b, t_e)\,dt_b\,dt_e$. In practice, if the $i$th
spectrum begins at $t_j$ and ends at $t_k$, we use $t_{b1} = (t_{j-1} + t_j)/2$, $t_{b2} = (t_j + t_{j+1})/2$,
$t_{e1} = (t_{k-1} + t_k)/2$ and $t_{e2} = (t_k + t_{k+1})/2$, with a somewhat more complicated expression
for a single SHERB spectrum.
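The midpoint rule for the integration limits can be sketched as follows. This is a minimal Python illustration with hypothetical names; two choices in it are assumptions that the text does not specify: the limits are clamped to the burst boundaries when the summed spectrum touches the first or last SHERB spectrum, and the single-SHERB case (which needs the more complicated expression) is rejected rather than handled.

```python
def integration_limits(edges, j, k):
    """Limits (tb1, tb2, te1, te2) for a summed spectrum spanning
    edges[j] .. edges[k], where `edges` holds the SHERB boundary times
    t_0 < t_1 < ... < t_N.

    Assumptions of this sketch (not taken from the paper):
      * at the burst boundaries the limits are clamped to edges[0] / edges[-1];
      * sums of fewer than two SHERB spectra are not supported.
    """
    if not 0 <= j < k < len(edges):
        raise ValueError("need 0 <= j < k <= N")
    if k - j < 2:
        raise ValueError("a single SHERB spectrum needs a separate expression")

    def mid(a, b):
        return 0.5 * (edges[a] + edges[b])

    tb1 = mid(j - 1, j) if j > 0 else edges[0]
    tb2 = mid(j, j + 1)
    te1 = mid(k - 1, k)
    te2 = mid(k, k + 1) if k < len(edges) - 1 else edges[-1]
    return tb1, tb2, te1, te2
```

With boundary times 0, 1, 2, 3, 4 s, the spectrum from 1 s to 3 s is assigned lines beginning between 0.5 and 1.5 s and ending between 2.5 and 3.5 s.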

As examples, we consider three different functional forms for $g(t_e - t_b)$. In each case
there are two major variants. The first variant (eqs. [7], [9], and [11] below) assumes that
$g$ is the same function of the persistence time $t_e - t_b$ with the same normalization for all
bursts, and thus the line frequency varies from burst to burst (i.e., lines are more likely to
occur in long bursts). The second variant (eqs. [8], [10], and [12] below) assumes the line
frequency is the same for all bursts and therefore the normalization of $g$ varies from burst
to burst. In practice we use the second case.

Model 1: $g(t_e - t_b) = c$. If $c$ is the same for all bursts then the first variant of this
model is

$$\begin{aligned}
p(l_i \mid cHI) &= 1 - \exp\left[-c\,(t_{b2} - t_{b1})(t_{e2} - t_{e1})\right] , \\
p(l \mid cHI) &= f = 1 - \exp\left[-cT^2/2\right] , \qquad (7)
\end{aligned}$$

where $0 \le c < \infty$. Note that $f$ increases with the duration $T$. On the other hand, if
$p(l \mid cHI) = f$ for each burst then the second variant is

$$p(l_i \mid fHI) = 1 - (1 - f)^{2(t_{b2} - t_{b1})(t_{e2} - t_{e1})/T^2} , \qquad (8)$$

where $0 \le f \le 1$. Note that there is no dependence on the persistence time $t_e - t_b$.
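The two variants of Model 1 are related by inverting the second line of eq. (7) for $c$. A minimal Python sketch of this relation follows; all function names are hypothetical, and the numbers in the usage lines are illustrative only.

```python
import math


def gamma_model1(tb1, tb2, te1, te2, duration):
    """Exponent of (1 - f) in eq. (8): 2 (tb2 - tb1)(te2 - te1) / T^2."""
    return 2.0 * (tb2 - tb1) * (te2 - te1) / duration ** 2


def p_line_fixed_f(f, gamma):
    """Second variant: p(l_i | f H I) = 1 - (1 - f)^gamma."""
    return 1.0 - (1.0 - f) ** gamma


def p_line_fixed_c(c, tb1, tb2, te1, te2):
    """First variant, first line of eq. (7)."""
    return 1.0 - math.exp(-c * (tb2 - tb1) * (te2 - te1))


def c_from_f(f, duration):
    """Invert f = 1 - exp(-c T^2 / 2) for the normalization c."""
    return -2.0 * math.log1p(-f) / duration ** 2


if __name__ == "__main__":
    limits = (0.5, 1.5, 2.5, 3.5)   # tb1, tb2, te1, te2 in seconds (toy values)
    T, f = 4.0, 0.1
    print(p_line_fixed_f(f, gamma_model1(*limits, T)))
    print(p_line_fixed_c(c_from_f(f, T), *limits))  # same value via eq. (7)
```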



Model 2: $g(t_e - t_b) = c\,b^2 \exp[-b(t_e - t_b)]$. This model would result from a sequence
of independent line occurrences, that is, the probability of the line occurring in any given
time interval does not affect the probability of its presence in the next time interval. If $c$ is
constant for all bursts then

$$\begin{aligned}
p(l_i \mid cHI) &= 1 - \exp\left[-c \left(e^{b t_{b2}} - e^{b t_{b1}}\right)\left(e^{-b t_{e1}} - e^{-b t_{e2}}\right)\right] , \\
p(l \mid cHI) &= f = 1 - \exp\left[-c \left(e^{-bT} - (1 - bT)\right)\right] , \qquad (9)
\end{aligned}$$

where $0 \le c < \infty$. If we assume $p(l \mid cHI) = f$ then

$$p(l_i \mid fHI) = 1 - (1 - f)^{\left(e^{b t_{b2}} - e^{b t_{b1}}\right)\left(e^{-b t_{e2}} - e^{-b t_{e1}}\right)/\left((1 - bT) - e^{-bT}\right)} , \qquad (10)$$

where $0 \le f \le 1$. In this case there is a strong dependence on the persistence time $t_e - t_b$
since $\left(e^{b t_{b2}} - e^{b t_{b1}}\right)\left(e^{-b t_{e2}} - e^{-b t_{e1}}\right) = \exp\left[-b\,(t_{e2} - t_{b2})\right] \left(1 - e^{-b(t_{b2} - t_{b1})}\right)\left(1 - e^{b(t_{e2} - t_{e1})}\right)$.

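Model 3: $g(t_e - t_b) = c\,(t_e - t_b)^{-b}$. Assuming $c$ is constant for all bursts

$$\begin{aligned}
p(l_i \mid cHI) &= 1 - \exp\left[-c\,\frac{(t_{e2} - t_{b1})^{2-b} - (t_{e2} - t_{b2})^{2-b} - (t_{e1} - t_{b1})^{2-b} + (t_{e1} - t_{b2})^{2-b}}{(2-b)(1-b)}\right] , \\
p(l \mid cHI) &= f = 1 - \exp\left[-c\,T^{2-b}/(2-b)(1-b)\right] , \qquad (11)
\end{aligned}$$

where once again $0 \le c < \infty$. Requiring $p(l \mid cHI) = f$ for all bursts leads to

$$p(l_i \mid fHI) = 1 - (1 - f)^{\left((t_{e2} - t_{b1})^{2-b} - (t_{e2} - t_{b2})^{2-b} - (t_{e1} - t_{b1})^{2-b} + (t_{e1} - t_{b2})^{2-b}\right)/T^{2-b}} , \qquad (12)$$

where as usual $0 \le f \le 1$. Here there is also a strong dependence on the persistence time
$t_e - t_b$.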
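The exponents of $(1 - f)$ in eqs. (10) and (12) are ratios of the integral of $g$ over the box of begin and end times to its integral over the whole burst, so they can be checked numerically. The Python sketch below does this with a simple midpoint rule; the function names, the grid size, and the requirement $t_{b2} \le t_{e1}$ are assumptions of this illustration.

```python
import math


def gamma_model2(tb1, tb2, te1, te2, duration, b):
    """Exponent in eq. (10) for g = c b^2 exp[-b (te - tb)]."""
    box = ((math.exp(b * tb2) - math.exp(b * tb1))
           * (math.exp(-b * te1) - math.exp(-b * te2)))
    full = math.exp(-b * duration) - (1.0 - b * duration)
    return box / full


def gamma_model3(tb1, tb2, te1, te2, duration, b):
    """Exponent in eq. (12) for g = c (te - tb)^(-b); needs b < 1 and tb2 <= te1."""
    if not b < 1.0:
        raise ValueError("the burst integral converges only for b < 1")
    q = 2.0 - b
    box = ((te2 - tb1) ** q - (te2 - tb2) ** q
           - (te1 - tb1) ** q + (te1 - tb2) ** q)
    return box / duration ** q


def box_integral(g, tb1, tb2, te1, te2, n=200):
    """Midpoint-rule integral of g(te - tb) over [tb1, tb2] x [te1, te2]."""
    hb, he = (tb2 - tb1) / n, (te2 - te1) / n
    total = 0.0
    for i in range(n):
        tb = tb1 + (i + 0.5) * hb
        for j in range(n):
            total += g(te1 + (j + 0.5) * he - tb)
    return total * hb * he


if __name__ == "__main__":
    box, T, b = (0.5, 1.5, 2.5, 3.5), 4.0, 0.7
    numeric = box_integral(lambda d: b * b * math.exp(-b * d), *box)
    print(numeric / (math.exp(-b * T) - (1.0 - b * T)), gamma_model2(*box, T, b))
```

Setting $b = 0$ in Model 3 recovers the Model 1 exponent, which provides a second consistency check.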



3. DATABASE 

In Paper III we found that the probability $p(L_i \mid l_i HI)$ of detecting a line in a spectrum
was a function of the spectrum's SNR and the burst angle. Therefore we need these 
quantities for each spectrum from all the detectors for which there are data for a given 
burst. Because Ginga reported lines at ~ 20 keV and ~ 40 keV, we use SNR calculated 
between 25 and 35 keV. Thus the SNR measures the strength of the continuum in the 
energy range of interest which should mitigate the effect of different shape continua. 

We would like to search spectra with arbitrary beginning and end times, but the 
telemetry only provides spectra with discrete beginning and end times. Our search is meant 
to find the combination of consecutive SHERB spectra in which a candidate feature has 
the greatest significance. Thus, if N SHERB spectra span a burst, we need to consider 
$N(N+1)/2$ spectra. However, the database does not need to store parameters for all
$N(N+1)/2$ possible spectra since they can be calculated from a smaller set of data. Here
we assume that the burst angle and background count rate $R_b$ are constant for the entire
burst for the detector providing the SHERB spectra, a reasonable assumption since the 
burst durations are usually less than 100 s (the time scale over which the background rate 
might change significantly enough to affect our results; the burst angle will change on much 
longer time scales). The SNR for each possible spectrum can be calculated from the counts 
and accumulation time for each SHERB spectrum. Thus 

^ 7 ^^^^^ (13) 

where $C_i$ is the number of detected counts summed over all the SHERB spectra of which
the $i$th spectrum consists, $R_b$ is the background count rate, $\Delta t_i$ is the time over which
the spectrum was accumulated (i.e., the sum of the accumulation times of the constituent
SHERB spectra), and $\Delta E$ is the size of the energy range ($\Delta E \sim 10$ keV). Both $R_b$ and
$C_i$ are accumulated over $\Delta E$. The factor of $\Delta E^{1/2}$ converts the SNR from a ratio using
the counts over an energy range ($\Delta E$ will vary in size from detector to detector and burst
to burst) to a ratio using the counts per keV. Note that a livetime correction is not made.
Thus the database need contain parameters only for each SHERB spectrum, as discussed 
in detail below. 
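The following Python sketch shows how the SNR of every sum of consecutive SHERB spectra can be obtained from per-SHERB counts and accumulation times alone, using cumulative sums. The SNR expression used, $(C_i - R_b \Delta t_i)/\sqrt{R_b\,\Delta t_i\,\Delta E}$, is an assumed form chosen to be consistent with the symbol definitions above (background-subtracted counts over a background-dominated uncertainty, expressed per keV); it is not necessarily the exact expression of eq. (13). The function and argument names are hypothetical.

```python
import math
from itertools import accumulate


def snr_all_sums(counts, durations, bkg_rate, delta_e):
    """SNR for every sum of consecutive SHERB spectra.

    counts    : detected counts in the 25-35 keV channels, one per SHERB spectrum
    durations : accumulation time of each SHERB spectrum (s)
    bkg_rate  : background count rate R_b in the same channels (counts/s),
                taken as constant over the burst
    delta_e   : width of the energy range (keV)

    ASSUMPTION: SNR = (C - R_b dt) / sqrt(R_b dt dE); see the lead-in text.
    Returns a dict keyed by inclusive, zero-based (first, last) indices.
    """
    if len(counts) != len(durations):
        raise ValueError("counts and durations must have the same length")
    cum_counts = [0.0] + list(accumulate(counts))
    cum_time = [0.0] + list(accumulate(durations))
    snr = {}
    for first in range(len(counts)):
        for last in range(first, len(counts)):
            c = cum_counts[last + 1] - cum_counts[first]
            dt = cum_time[last + 1] - cum_time[first]
            snr[(first, last)] = (c - bkg_rate * dt) / math.sqrt(bkg_rate * dt * delta_e)
    return snr


if __name__ == "__main__":
    # Toy numbers, not BATSE data.
    result = snr_all_sums([200, 500], [1.0, 1.0], bkg_rate=100.0, delta_e=10.0)
    best = max(result, key=result.get)
    print(best, result[best])
```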

For a burst to be included in our database it had to have a peak count rate in 
the Large Area Detectors (LADs) over ~7,500 in the 50-300 keV energy band; of 
the 1550 bursts on which BATSE triggered between 1991 April and 1996 May, 297 
met this criterion. After identifying the channels between 25 and 35 keV, we extracted 
the number of counts in these channels for each SHERB spectrum for all the detectors 
that provided data. The background counting rate is the time average from a series of 
SHERB spectra after the burst, if available, and SHER spectra (Spectroscopy detector 
High Energy Resolution — background spectra accumulated when BATSE is not in burst 
mode) before and after the burst, if necessary. Calculating higher accuracy backgrounds 
is unnecessary for our purposes since here we only need a measure of the strength of the 
burst, not an accurate background-subtracted spectrum for spectral fitting. In some cases 
the calculated background was clearly too high — indicated by a large number of negative 
background-subtracted count rates — or low — found by inspecting weak bursts with large 
SNRs. Incorrect background rates were recalculated, often using stretches of background 
in the middle of, or just after, a burst. A burst for which the SNR is sensitive to the 
background level is usually too weak to harbor detectable spectral lines. Spectra from 
all detectors were included in the database, but we ignored data from detectors set at 
low gain or with burst angles greater than ~ 85°: low gain detectors have a low energy
cutoff $E_{\rm low}$ above the energies at which lines have been observed, and the spacecraft shields
the detectors for very large burst angles. Line detectability depends on the energy range
covered; a line at 20 keV will not be detected if the spectrum begins at 20 keV. Therefore
we also characterized each spectrum by its low energy edge $E_{\rm low}$, which we define as the
upper end of the SLED (an electronic artifact just above a spectrum's true low energy
cutoff; Band et al. 1992). The database therefore consists of the following data for each
detector for each burst: the time interval over which the SHERB spectra were accumulated;
the number of counts in the 25-35 keV range for each SHERB spectrum; and additional
information for each burst-detector pair such as the burst angle, the energy $E_{\rm low}$ of the
upper end of the SLED, and the exact energy width $\Delta E$ of the 25-35 keV range.

The product of the methodology and database described above is the probability for
each burst of detecting a line if present, $p(L \mid lHI)$. The primary purpose of this probability
is to assess the consistency between BATSE and other missions, and to estimate the
frequency with which lines occur. However, this probability can also be used to identify the
bursts in which lines are most detectable. Our search should therefore focus on those bursts.
As an example, we characterized each burst by the maximum SNR for any spectrum during
the burst. Figure 1 presents the cumulative distribution. Since the gain, and thus the
energy range included in the spectrum, varies from burst to burst and detector to detector,
we show distributions by maximum $E_{\rm low}$. Thus a line at 20 keV would be detectable in
those bursts with a detector for which $E_{\rm low} < 15$ keV. From Paper III we find that the
GB 880205 line set (19.4 and 38.8 keV) would have been detected half the time by BATSE
for SNR ~ 7. The line at 38.8 keV appears to determine the detectability of this line set
in the BATSE spectra, and therefore we require $E_{\rm low} < 25$ keV. We see that the line at
~ 40 keV would have been detectable in the highest SNR spectrum in about 65 bursts.
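The selection described above (maximum SNR above a threshold in a detector whose low energy edge is below a chosen energy) can be sketched in Python as follows. The data layout and function name are hypothetical, and the numbers in the usage lines are toy values, not entries from the BATSE database.

```python
def count_detectable_bursts(bursts, snr_threshold, e_low_max):
    """Count bursts with at least one detector whose maximum SNR reaches
    `snr_threshold` and whose low energy edge E_low (keV) is below `e_low_max`.

    bursts : list of bursts; each burst is a list of (max_snr, e_low) tuples,
             one tuple per detector that observed the burst.
    """
    n_detectable = 0
    for detectors in bursts:
        if any(max_snr >= snr_threshold and e_low < e_low_max
               for max_snr, e_low in detectors):
            n_detectable += 1
    return n_detectable


if __name__ == "__main__":
    toy_bursts = [
        [(9.0, 12.0), (3.0, 30.0)],   # strong burst seen by a high gain detector
        [(8.0, 28.0)],                # strong, but the spectrum starts too high
        [(2.0, 10.0)],                # weak burst
    ]
    print(count_detectable_bursts(toy_bursts, snr_threshold=7.0, e_low_max=25.0))
```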



4. SIMPLIFIED LIKELIHOOD CALCULATION 

The Bayesian consistency measures and related quantities require $p(D \mid fHI)$, the
probability of obtaining the observed results $D$ assuming a hypothesis $H$ about bursts and
the detectors (Paper II); $p(D \mid fHI)$ is also known as the likelihood for $f$ and $H$. Thus
$D$ might represent the absence of a BATSE line detection or the Ginga line detections
in specific bursts, while $H$ might stand for the hypothesis that lines exist, the BATSE
detectors are modeled correctly, and the BATSE and Ginga results are consistent. In
our formulation we explicitly separate out the line frequency $f$. A burst with a detection
contributes to $p(D \mid fHI)$ a factor of $p(L_\sigma \mid fHI)$, while a burst with no line detection
contributes $1 - p(L_\sigma \mid fHI)$. Note that as before roman and greek indices specify spectra
and bursts, respectively. Thus, if there are line detections in $n_d$ bursts in a database of $N_B$
bursts then

$$p(D \mid fHI) = \prod_{\sigma=1}^{n_d} p(L_\sigma \mid fHI) \prod_{\sigma=n_d+1}^{N_B} \left[1 - p(L_\sigma \mid fHI)\right] , \qquad (14)$$

where the detections have been placed at the beginning of the database. The line frequency
$f$ is not a quantity of interest to the consistency issue, and therefore it is "marginalized,"

$$p(D \mid HI) = \int df\, p(f \mid HI)\, p(D \mid fHI) . \qquad (15)$$

The "prior" for $f$, $p(f \mid HI)$, is our assessment of the likely value of $f$ before the data $D$
were obtained. In general we assume that $f$ could be any value between 0 and 1, and
therefore $p(f \mid HI) = 1$.

We saw in §2 that $p(l_i \mid fHI) = 1 - (1 - f)^{\gamma_i}$ (e.g., eqs. [8], [10] or [12]). Consequently
the probabilities of detecting a line in a given burst $p(L_\sigma \mid fHI)$ and of obtaining the
observed database $p(D \mid fHI)$ are complicated functions of $f$. Thus the integral over $f$ in
eq. (15) will be a time-consuming numerical calculation since information from all the bursts
must be included in evaluating the integrand at each value of $f$. However, we can make
some simplifying assumptions. First, we assume the false positive probability $p(L_i \mid \bar{l}_i HI)$ is
very small and can be neglected. Second, the absence of a detection in the BATSE dataset
indicates that the line frequency $f$ is probably small, and therefore

$$p(l_i \mid fHI) = 1 - (1 - f)^{\gamma_i} \simeq \gamma_i f . \qquad (16)$$

Consequently:

$$p(L_i \mid fHI) \simeq p(L_i \mid l_i HI)\,\gamma_i f , \qquad (17)$$

$$p(L_\sigma \mid fHI) \simeq \left[\sum_{i=1}^{N_\sigma(N_\sigma+1)/2} p(L_i \mid l_i HI)\,\gamma_i\right] f , \qquad (18)$$

$$p(D \mid fHI) \simeq 1 - \left[\sum_{\sigma=1}^{N_B} \sum_{i=1}^{N_\sigma(N_\sigma+1)/2} p(L_i \mid l_i HI)\,\gamma_i\right] f , \qquad (19)$$

$$M(L_D \mid l_D HI) = \sum_{\sigma=1}^{N_B} \sum_{i=1}^{N_\sigma(N_\sigma+1)/2} p(L_i \mid l_i HI)\,\gamma_i . \qquad (20)$$

$N_\sigma$ is the number of SHERB spectra spanning the $\sigma$th burst. We approximated $p(D \mid fHI)$
in eq. (19) for the case of no detections, which is currently relevant for BATSE (i.e., $n_d = 0$
in eq. [14]). The quantity $M(L_D \mid l_D HI)$ is the sum of each burst's detection probability;
thus $M(L_D \mid l_D HI)$ will be nearly equal to the number of bursts if all bursts are uniformly
strong, whereas weak bursts will not contribute to this statistic. Since $M(L_D \mid l_D HI)$ is the
first order expansion in $f$, it is valid for small values of $f$, i.e., under the assumption that a
line is unlikely to be present in a given burst. The small $f$ approximation in eq. (19) is valid
only for $f \ll 1/M(L_D \mid l_D HI)$; note that $(1 - f)^m \simeq 1 - mf$ is not accurate for $f > 1/m$
even if $1/m$ is small. However we shall use this approximation up to $f \sim 1/M(L_D \mid l_D HI)$. We
can now marginalize $f$ to obtain

$$p(D \mid HI) = \int df\, p(f \mid HI)\, p(D \mid fHI) = \frac{1}{2M(L_D \mid l_D HI)} , \qquad (21)$$

where we set $p(D \mid fHI) = 0$ for $f > 1/M(L_D \mid l_D HI)$. Using the expression in Paper II for
the probability distribution for $f$ given the new data $D$ (here the absence of a BATSE line
detection) we find

$$p(f \mid DHI) = \begin{cases}
2M(L_D \mid l_D HI)\left[1 - M(L_D \mid l_D HI)\,f\right] , & f < 1/M(L_D \mid l_D HI) \\
0 , & f > 1/M(L_D \mid l_D HI)
\end{cases} \qquad (22)$$

where we used the prior $p(f \mid HI) = 1$. In both eqs. (21) and (22) we extend
the approximation in eq. (16) to the regime $f \sim 1/M(L_D \mid l_D HI)$ where the
approximation will have broken down. In Figure 2 we compare $p(f \mid DHI)$ in eq. (22) to
$p(f \mid DHI) = (m + 1)(1 - f)^m$ which results from the absence of a line detection in $m$ bursts
in which lines could have been detected with 100% probability (i.e., $p(L_\sigma \mid l_\sigma HI) = 1$). As
can be seen, the small $f$ approximation is accurate to a factor of ~ 2 in normalization and
extent. Given the uncertainties and other approximations in this analysis, this accuracy is
sufficient. In part, the small $f$ approximation demonstrates the utility of $M(L_D \mid l_D HI)$ as
a diagnostic statistic.
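The quantities in eqs. (20)-(22) are simple to evaluate once the per-spectrum detection probabilities and weights $\gamma_i$ are available. The Python sketch below is a minimal illustration with hypothetical function names and an assumed nested-list data layout; it also evaluates the $(m+1)(1-f)^m$ distribution used for comparison.

```python
def m_statistic(bursts):
    """Eq. (20): sum over bursts and spectra of p(L_i | l_i H I) * gamma_i.

    bursts : list of bursts, each a list of (p_detect_given_line, gamma) pairs,
             one pair per averaged spectrum.
    """
    return sum(p * gamma for burst in bursts for p, gamma in burst)


def marginal_likelihood(m):
    """Eq. (21): p(D | H I) = 1 / (2 M) for a flat prior and no detections."""
    return 1.0 / (2.0 * m)


def posterior_small_f(f, m):
    """Eq. (22): small-f posterior, truncated at f = 1/M."""
    if f < 0.0 or f > 1.0 / m:
        return 0.0
    return 2.0 * m * (1.0 - m * f)


def posterior_certain_detection(f, m_bursts):
    """Posterior when lines would be detected with certainty in m bursts
    and none was seen: (m + 1)(1 - f)^m."""
    return (m_bursts + 1) * (1.0 - f) ** m_bursts


if __name__ == "__main__":
    m = 20.0
    print(marginal_likelihood(m))
    print(posterior_small_f(0.0, m), posterior_certain_detection(0.0, 20))
```

At $f = 0$ the two distributions differ by roughly a factor of two (40 versus 21 for $M = m = 20$), in line with the accuracy quoted above.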



5. DISCUSSION 

The quantity $M(L_D \mid l_D HI)$ characterizes the detectability of spectral lines in a
burst database and thus our ability to learn about lines from the database. Primarily,
$M(L_D \mid l_D HI)$ is the approximate number of bursts in which lines could be detected. We
have seen that its inverse is twice $p(D \mid HI)$, the likelihood of the hypothesis $H$, and is
also the width of the distribution for the line frequency $f$. Using the burst database described
in §3 we calculated $M(L_D \mid l_D HI)$ for the BATSE spectra. Since the line distribution (the
frequency of lines of different types and where within the burst they occur) is unknown, we
made a number of modeling assumptions. These assumptions, along with the supposition
that lines exist and that the modeling of the BATSE detectors is correct, constitute the
hypothesis $H$. First, we assume that the line frequency $f$ is the same for all bursts, and
that a line can occur in any spectrum with equal probability (the second variant of model 1
in §2). Thus we use eq. (8) to define $\gamma_i$, i.e., $\gamma_i = 2(t_{b2} - t_{b1})(t_{e2} - t_{e1})/T^2$. Second, we
approximate $p(L_i \mid l_i HI)$, the probability of detecting a line in a spectrum, as 1 for SNRs
above a threshold value if the low energy cutoff is less than a certain energy. In Paper III
we found that this probability $p(L_i \mid l_i HI)$ was a function of both the SNR and the burst
angle, and that the transition from 0 to 1 occurs over a range of SNR. The calculations
in Paper III assumed a low energy cutoff of 10 keV, which is rarely achieved because of
the SLED electronic artifact, which raises the effective cutoff (Band et al. 1992), and the
gain settings of the SDs. However, in Band et al. (1996, henceforth Paper IV) we showed
that the detectability of a line is insensitive to the low energy cutoff as long as sufficient
continuum is included below the line candidate (in the example there, 15-20 keV). We have
been using the two Ginga detections to characterize the unknown line distribution. For
the line at 21.1 keV in the S1 segment of GB 870303 the transition between a detection
probability of 0 and 1 occurs at a SNR of ~ 2, while for the harmonic lines in GB 880205 at
19.4 and 38.8 keV the transition occurs at a SNR of ~ 7; in both cases the detectability is
also angle dependent. However, for greater generality we present in Figure 3 $M(L_D \mid l_D HI)$
for a range of SNRs and low energy cutoffs. As can be seen, $M(L_D \mid l_D HI)$ ~ 50 for
detecting lines similar to the one in GB 870303, assuming a low energy cutoff of 15 keV
will suffice. If the detectability of the GB 880205 lines is dominated by the 38.8 keV line
(see Figure 1 of Paper III), and thus a low energy cutoff less than 25 keV is necessary, then
$M(L_D \mid l_D HI)$ ~ 20. It is clear from these curves that despite the large number of bursts
observed by BATSE in the past 5 years, lines would be detectable in relatively few bursts.
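
The weighting in the first assumption can be illustrated with a short Python sketch. It assumes, purely for illustration, a burst of duration $T$ divided into equal accumulations, and reads $(t_{b1}, t_{b2})$ and $(t_{e1}, t_{e2})$ as the windows within which the line begins and ends for a given spectrum. The function names and the equal-bin layout are hypothetical and do not describe the actual BATSE database. For a uniform density $2/T^2$ over $0 \le t_b \le t_e \le T$, the weights of all start and end combinations should sum to one.

```python
def gamma_weight(tb1, tb2, te1, te2, T):
    """gamma_i = 2 (t_b2 - t_b1)(t_e2 - t_e1) / T^2 for disjoint start and end windows."""
    return 2.0 * (tb2 - tb1) * (te2 - te1) / T**2


def total_weight(n_bins, T=1.0):
    """Sum the weights over every (start bin, end bin) pair of an equal-bin burst."""
    dt = T / n_bins
    edges = [k * dt for k in range(n_bins + 1)]
    total = 0.0
    for j in range(n_bins):                 # bin in which the line begins
        for k in range(j + 1, n_bins):      # later bin in which the line ends
            total += gamma_weight(edges[j], edges[j + 1], edges[k], edges[k + 1], T)
        # Line begins and ends in the same bin: only the triangle t_b <= t_e
        # is allowed, which is half of the rectangle given by the formula.
        total += 0.5 * gamma_weight(edges[j], edges[j + 1], edges[j], edges[j + 1], T)
    return total


# Weight for a line that starts in the first quarter and ends in the second half.
example = gamma_weight(0.0, 0.25, 0.5, 1.0, 1.0)
print(example, total_weight(8))
```

The sum equals unity for any number of bins, which is a useful consistency check on whatever set of spectra is formed from the consecutive accumulations.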

Only for the rare strong bursts are lines detectable in spectra accumulated by Ginga 
and the BATSE SDs. Since BATSE detects bursts with a much larger detector than the 
SDs, whereas Ginga detected bursts with the same detector which accumulated spectra, 
the BATSE burst database includes a larger fraction of weak bursts. In addition, Ginga 
reported lines at ~ 20 and ~ 40 keV. The Ginga burst detector was sensitive down to 2 keV 
(Murakami et al. 1989) whereas BATSE's $E_{\rm low}$ is typically ~ 20 keV. On the other hand,
each BATSE SD has an area twice that of the Ginga detector. Therefore BATSE is usually 
less sensitive than Ginga to lines below ~ 40 keV, and more sensitive above 40 keV. 

The necessary Ginga data have not yet been extracted to complete the study of the
consistency between the BATSE and Ginga observations as presented in Paper II. However,
the small values of $M(L_D \mid l_D HI)$ for the BATSE bursts and the preliminary value of
$M(L_D \mid l_D HI)$ ~ 5.4 for a Ginga detection of lines similar to those in GB 880205
(Fenimore et al. 1993) indicate that the apparent discrepancy between BATSE and Ginga
is not severe. For example, the probability that the one detection of a line similar to
GB 880205 occurred in the Ginga bursts and not in the BATSE data is (Papers I and II)
$P \sim M_{Ginga}/(M_{Ginga} + M_{BATSE}) \sim 5.4/(5.4 + 20) \sim 0.2$, which is hardly an improbable
event.
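
The arithmetic of this estimate is reproduced in the following Python sketch. The function name is hypothetical; the two input values are the ones quoted in the text.

```python
def prob_detection_in_first(m_first, m_second):
    """Probability that a single detection falls in the first database,
    given the M values (effective numbers of sensitive bursts) of both."""
    return m_first / (m_first + m_second)


M_GINGA = 5.4   # preliminary Ginga value for GB 880205-like lines
M_BATSE = 20.0  # BATSE value for GB 880205-like lines

p_ginga = prob_detection_in_first(M_GINGA, M_BATSE)
print(f"P ~ {p_ginga:.2f}")
```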



6. SUMMARY 

We now have a methodology which provides the probability for detecting a line in 
the bursts observed by BATSE. This probability can be used to evaluate the consistency 
between the line detections and nondetections by BATSE and other burst missions, estimate 
the frequency with which lines occur, and identify bursts in which lines are likely to be 
discovered by new search techniques. The new element in the methodology is the weighting 
of the probabilities for detecting a line in each of the spectra spanning the burst which 
can be formed from the SHERB spectra provided by the telemetry. This weighting is 
model-dependent; we explored three models where the line occurrence depends only on the 
time a line persists. 

Implementing this methodology requires parameter values characterizing each 
spectrum in the bursts under consideration. To this end we built a database of the necessary 
information. We have been using this database to identify the bursts in which lines may be 
detected. 

We calculated the number of bursts in which lines of various types, as parameterized 
by the minimum SNR and maximum low energy cutoff necessary for a detection, could 
have been detected if the lines were indeed present. This calculation assumes a small
line frequency. These quantities are necessary for the probability that no lines would be
detected in the BATSE data and for the distribution for the line frequency, and therefore
these numbers are essential for measures of the consistency between the BATSE and Ginga
line observations. Ginga-like lines can be detected in relatively few of the large number
of bursts BATSE has observed; for example, lines similar to the GB 880205 pair of lines
are detectable in only ~ 20 BATSE bursts. Although comparable Ginga data are not yet
available, the discrepancy between Ginga and BATSE does not appear to be severe. For
example, a simple calculation shows that the probability that the GB 880205 line set would
be detected in a Ginga burst rather than a BATSE burst is ~ 20%, which is hardly improbable.

Finally, to evaluate the significance of line candidates we have adopted the maximum
likelihood ratio test, which is more appropriate than the F-test. The F-test should be
used when the uncertainties on the datapoints are unknown. These two tests give similar
significances when the reduced $\chi^2$ is of order unity. Indeed, we find little change in the
significances given by the two tests for the line candidates identified by the visual search of
BATSE spectra.



A. SIGNIFICANCE STATISTIC

To evaluate the significance of a given spectral feature we have been using the
F-test, which compares fits with nested models (i.e., one model is a subset of the other).
Assume that $\chi^2_1$ results from fitting a spectrum of $N_c$ channels by a continuum model
with $r_1$ parameters (thus $\nu_1 = N_c - r_1$ degrees-of-freedom), and that $\chi^2_2$ results from
fitting the spectrum by a continuum+line(s) model with $r_2$ parameters ($\nu_2 = N_c - r_2$
degrees-of-freedom). In the continuum+line(s) model, the $r_1$ continuum parameters are the
same as for the continuum model; thus an additional $\Delta\nu = r_2 - r_1$ parameters have been
added by modeling the line(s). If the continuum model is correct and there are actually no
lines then the quantity

$$F_0 = \frac{(\chi^2_1 - \chi^2_2)/\Delta\nu}{\chi^2_2/\nu_2} \qquad (A1)$$

is distributed as $F(\Delta\nu, \nu_2)$. Consequently $P(F > F_0)$ is the probability of finding $F$ larger
than or equal to $F_0$ when a continuum+line(s) model is fit to a count spectrum resulting
from a photon spectrum correctly described by the continuum model. Thus this is the
probability that the improvement in $\chi^2$ by adding the additional $\Delta\nu$ line parameters is a
fluctuation.

The F-test we have been using is based on a maximum likelihood ratio test where
the uncertainties are unknown. The original version defines $S^2$ without uncertainties,
$S^2 = \sum_{i=1}^{N_c} (y_i - m_i)^2/N_c$, where $y_i$ is the observed value and $m_i$ is the model value. Then

^' = (^^) 

is distributed as $F(\Delta\nu, \nu_1)$ (Eadie et al. 1971, p. 238). Since $N_c$ is large, there is little
difference between $\nu_1$ and $\nu_2$. Both $F_0$ and $F_1$ use ratios of $\chi^2$ and $S^2$, respectively. Thus the
F statistic eliminates the effect of a systematic multiplicative error in the uncertainties used
in $\chi^2$ or of an unknown constant uncertainty on the datapoints in $S^2$. This demonstrates
why the F-test is appropriate for the case where the uncertainties are not known.
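
The cancellation of a multiplicative error can be seen in a small Python sketch. The data, the two fixed "models", and the degrees of freedom below are invented for illustration and are not fits to any BATSE spectrum; the $F_0$ expression is the standard nested-model ratio, assumed here to be the statistic of eq. (A1).

```python
def chi2(y, model, sigma):
    """Chi-squared for a constant uncertainty sigma on every datapoint."""
    return sum(((yi - mi) / sigma) ** 2 for yi, mi in zip(y, model))


def f_statistic(chi2_1, chi2_2, dnu, nu2):
    """F_0 = [(chi2_1 - chi2_2) / dnu] / [chi2_2 / nu2] (assumed form of eq. A1)."""
    return ((chi2_1 - chi2_2) / dnu) / (chi2_2 / nu2)


# Hypothetical 6-channel spectrum with a dip, and two fixed model curves.
y = [10.2, 9.1, 6.0, 9.3, 10.4, 9.8]
continuum = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
with_line = [10.0, 9.5, 6.5, 9.5, 10.0, 10.0]
dnu, nu2 = 3, 2  # hypothetical: 3 added line parameters, 2 remaining degrees of freedom

results = {}
for sigma in (0.5, 1.0):  # the second value mis-states the uncertainty by a factor of 2
    c1 = chi2(y, continuum, sigma)
    c2 = chi2(y, with_line, sigma)
    results[sigma] = (c1 - c2, f_statistic(c1, c2, dnu, nu2))
    print(f"sigma={sigma}: delta chi2 = {c1 - c2:.2f}, F0 = {results[sigma][1]:.3f}")
```

Doubling the assumed uncertainty reduces $\Delta\chi^2$ by a factor of 4 but leaves $F_0$ unchanged, because the scale factor cancels in the ratio.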

However, we find that the uncertainties in our data result predominantly if not
exclusively from counting statistics, and consequently the uncertainties are known. Thus
we can use the fundamental maximum likelihood ratio test (MLRT) from which the F-test
is derived. This test states that $\Delta\chi^2 = \chi^2_1 - \chi^2_2$ is distributed as $\chi^2(\Delta\nu)$ if the continuum
model is sufficient and no lines are present (Eadie et al. 1971, pp. 230-237). As with the
F-test, a small value of $P(\Delta\chi^2 > \Delta\chi^2_0)$ indicates a small probability that the continuum
model alone describes the data. In Figure 4 we compare the MLRT to the F-test (using the
$F_0$ statistic of eq. [A1]) for the same values of $\Delta\chi^2$ and different values of the reduced $\chi^2$,
$\chi^2_\nu = \chi^2/\nu$.
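
A comparison of this kind can be sketched in Python using only the standard library. The scenario (4 continuum parameters, 3 line parameters, 200 datapoints) is the one given in the caption of Figure 4. The sketch assumes that the reduced $\chi^2$ refers to the continuum+line(s) fit, so that $F_0 = (\Delta\chi^2/\Delta\nu)/\chi^2_\nu$; all function names are hypothetical, and the tail probabilities are computed with textbook series and continued-fraction methods rather than any routine used in the original analysis.

```python
import math


def chi2_sf(x, k):
    """P(chi^2 > x) for integer k degrees of freedom (closed-form recurrence)."""
    if x <= 0.0:
        return 1.0
    h = 0.5 * x
    if k % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    while a < 0.5 * k - 1e-9:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q


def _betacf(a, b, x, max_iter=500, eps=1e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (
            m * (b - m) * x / ((a + m2 - 1.0) * (a + m2)),               # even step
            -(a + m) * (a + b + m) * x / ((a + m2) * (a + m2 + 1.0)),    # odd step
        ):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betai(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def f_sf(f_value, d1, d2):
    """P(F > f_value) for an F(d1, d2) distribution."""
    return betai(0.5 * d2, 0.5 * d1, d2 / (d2 + d1 * f_value))


def compare_tests(delta_chi2, reduced_chi2, n_points=200, r_cont=4, r_line=3):
    """Return (MLRT probability, F-test probability) for one line candidate."""
    nu2 = n_points - r_cont - r_line
    f0 = (delta_chi2 / r_line) / reduced_chi2  # ASSUMES reduced chi2 = chi2_2 / nu2
    return chi2_sf(delta_chi2, r_line), f_sf(f0, r_line, nu2)


for reduced in (0.7, 1.0, 1.5):
    p_mlrt, p_f = compare_tests(15.0, reduced)
    print(f"reduced chi2 = {reduced}: MLRT {p_mlrt:.2e}, F-test {p_f:.2e}")
```

The MLRT probability depends only on $\Delta\chi^2$ and $\Delta\nu$, whereas the F-test probability for the same $\Delta\chi^2$ grows as the reduced $\chi^2$ increases.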

As Figure 4 shows, the two tests give the same values for $\chi^2_\nu$ slightly less than 1, which
is not surprising since we expect $\chi^2_\nu$ ~ 1 if our spectral model is correct. A value of $\chi^2_\nu$
which differs significantly from 1 may result from an incorrect value for the uncertainties
used in $\chi^2$, which the F-test attempts to correct. However, other factors may cause $\chi^2_\nu$ to
differ from unity, such as an incorrect continuum model and inaccuracies in the detector
response model and the energy calibration. This is a major reason to favor the MLRT. To
determine the continuum from which a candidate line deviates, we include all the spectral
data, including continuum far from the line. However, we do not know the true continuum
shape, which might raise $\chi^2_\nu$. Also, the F-test depends on the total number of datapoints
(the F statistic has a distribution which is a function of the number of degrees-of-freedom).
On the other hand, the MLRT is a function only of the number of added parameters $\Delta\nu$.

In most cases the MLRT and the F-test will lead to the same conclusion as to 
whether a feature is significant. Indeed, evaluating the line candidates from the visual 
search of BATSE SD spectra with the MLRT as opposed to the F-test (which was used in 
Paper IV) does not lead to a qualitative difference in significance. Figure 5 compares the 
probabilities given by these two tests. As concluded in Paper IV, none of the line candidates
is significant.

B. NOTATION

The following is a list of the symbols used in this paper.

$b$, $c$ -- constants used in modeling $g(t_b, t_e)$.

$C_i$ -- total counts over energy range $\Delta E$ in the $i$th spectrum.

$\chi^2_i$ -- the $\chi^2$ statistic for the $i$th spectral fit.

$D$ -- the observations, specifically whether or not lines were detected in a burst
database.

$\Delta\chi^2$ -- the difference in $\chi^2$ between continuum and continuum+line(s) fits to a
spectrum with a candidate line feature.

$\Delta E$ -- width of the energy range over which the SNR is measured.

$\Delta t_i$ -- accumulation time of the $i$th spectrum.

$\Delta\nu$ -- number of parameters added by modeling a line.

$E_{\rm low}$ -- low energy edge of the usable energy range.

$f$ -- the frequency with which a line type is present in any burst. The use of $f$ assumes
that each burst, regardless of its characteristics, has the same probability of hosting
the line type.

$g(t_b, t_e)$ -- probability that the line is present between $t_b$ and $t_e$.

$\gamma_i$ -- the factor in $p(l_i \mid fHI) = \gamma_i f$ in the small $f$ approximation.

$H$ -- hypothesis about the presence of lines and the operation of the BATSE and/or
Ginga detectors.

$I$ -- the proposition representing our understanding of the burst detector and other
information known or assumed about the burst.

$l_\sigma$ -- the proposition that a line is present in the $\sigma$th burst.

$l_i$ -- the proposition that a line is present in the $i$th spectrum. An index specifying the
burst is suppressed.

$L_\sigma$ -- the proposition that a line is detected in the $\sigma$th burst.

$L_i$ -- the proposition that a line is detected in the $i$th spectrum. An index specifying
the burst is suppressed.

$M(L_D \mid l_D HI)$ -- the sum of the probabilities for each burst that a line would be
detected in the burst if the line is present, calculated in the small $f$ approximation.

$M_{Ginga}$, $M_{BATSE}$ -- value of $M(L_D \mid l_D HI)$ for the Ginga and BATSE burst databases,
respectively.

$n_d$ -- number of line detections in a burst database.

$N_\sigma$ -- number of SHERB spectra spanning the $\sigma$th burst.

$N_b$ -- number of bursts in a burst database.

$N_c$ -- number of channels in a spectrum.

$\nu_i$ -- number of degrees-of-freedom in the $i$th spectral fit.

$p(D \mid fHI)$ -- the likelihood of $H$ and $f$, the probability of observing $D$ given
hypothesis $H$ and line frequency $f$.

$p(D \mid HI)$ -- the likelihood of $H$, the probability of observing $D$ given hypothesis $H$.

$p(f \mid HI)$ -- "prior" for the line frequency, the probability distribution for $f$ given $H$
and $I$, but without knowledge of $D$.

$p(f \mid DHI)$ -- the probability distribution for $f$ based on the observations $D$ given $H$
and $I$.

$p(L \mid fHI) = p(L_\sigma \mid fHI)$ -- the probability of detecting a line in the $\sigma$th burst
assuming a line frequency $f$.

$p(L_\sigma \mid l_\sigma HI)$ -- the probability of detecting a line somewhere in the $\sigma$th burst given
that it is indeed present.

$p(L_i \mid fHI)$ -- the probability of detecting a line in the $i$th spectrum assuming a line
frequency $f$.

$p(L_i \mid l_i HI)$ -- the probability of detecting a line in the $i$th spectrum given that it is indeed
present.

$p(L_i \mid \bar{l}_i HI)$ -- the probability of a false positive, i.e., of detecting a line in the $i$th spectrum
when none is present.

$p(l_i \mid fHI)$ -- the probability that the line is in the $i$th spectrum assuming a line
frequency $f$.

$p(l_i \mid l_\sigma)$ -- the probability that if a line is present in the $\sigma$th burst, it is in the $i$th
spectrum of that burst.

$R_b$ -- background count rate in a detector over energy range $\Delta E$.

$r_i$ -- number of parameters in the $i$th spectral fit.

$t_b$ -- time line first becomes apparent.

$t_e$ -- time line is last apparent.

$T$ -- burst duration.



Fig. 1.-- Cumulative distribution of bursts by maximum signal-to-noise ratio (SNR). Each
burst is characterized by the maximum value of the SNR in the range 25-35 keV for any
spectrum formed from consecutive SHERB spectra. The curves are for different maximum
values of a spectrum's effective low energy edge, $E_{\rm low}$.

Fig. 2.-- The distribution for the line frequency $f$, $p(f \mid HI)$. In the first case (solid curve),
the sum of the probabilities for each burst that a line would be detected if present is
$M(L_D \mid l_D HI) = 40$. This case is calculated in the approximation that $f$ is small, which
breaks down for $f$ ~ $1/M(L_D \mid l_D HI)$. In the second case (dashed curve) there are $m = 40$
bursts in which lines are always detectable.

Fig. 3.-- $M(L_D \mid l_D HI)$ as a function of the threshold signal-to-noise ratio (SNR) for different
maximum low energy cutoffs. A line is detectable in a spectrum if the spectrum's SNR in
the 25-35 keV band exceeds the threshold and the low energy cutoff $E_{\rm low}$ is less than the
maximum value labeling each curve. $M(L_D \mid l_D HI)$ is a measure of: the number of bursts
in which lines could have been detected; the inverse of the likelihood; and the width of the
distribution for the line frequency.

Fig. 4.-- A comparison between significance given by the maximum likelihood ratio test
(MLRT; solid curve) and the F-test (dashed curves) as a function of $\Delta\chi^2$. A BATSE line
candidate scenario was used: a 4 parameter continuum model, a 3 parameter line model,
and 200 datapoints. The F-test is shown for the labeled values of the reduced $\chi^2$.

Fig. 5.-- Comparison between the significances given by the maximum likelihood ratio test
(MLRT) and the F-test for the line candidates identified by the visual search of BATSE
spectra (Band et al. 1996).